singleton value
161882dd2d19c716819081aee2c08b98-Supplemental.pdf
We restate the theoretical statements and the algorithms here for completeness and convenience. Given a normalized monotone submodular function f:2 V! The objective aims to find the partition such that the minimum-valued block in the partition is maximized. For a ground set V and its elements (v1,v2,...,v n) coming in an arbitrary streaming order, the output solution of Alg. 1 has The output solution of Alg. 2 has We construct a set cover function as the tight example for Corollary 1. We illustrate the set cover function graphically in Figure 1. The circles are the areas to cover for the set cover function and the green inner circles and the red triangles are elements in the ground set (the outer yellow circles are not elements). The inner circles (green) largely overlap with the outer circles (yellow).
The Permuted Striped Block Model and its Factorization -- Algorithms with Recovery Guarantees
Murray, Michael, Tanner, Jared
We introduce a novel class of matrices which are defined by the factorization $\textbf{Y} :=\textbf{A}\textbf{X}$, where $\textbf{A}$ is an $m \times n$ wide sparse binary matrix with a fixed number $d$ nonzeros per column and $\textbf{X}$ is an $n \times N$ sparse real matrix whose columns have at most $k$ nonzeros and are $\textit{dissociated}$. Matrices defined by this factorization can be expressed as a sum of $n$ rank one sparse matrices, whose nonzero entries, under the appropriate permutations, form striped blocks - we therefore refer to them as Permuted Striped Block (PSB) matrices. We define the $\textit{PSB data model}$ as a particular distribution over this class of matrices, motivated by its implications for community detection, provable binary dictionary learning with real valued sparse coding, and blind combinatorial compressed sensing. For data matrices drawn from the PSB data model, we provide computationally efficient factorization algorithms which recover the generating factors with high probability from as few as $N =O\left(\frac{n}{k}\log^2(n)\right)$ data vectors, where $k$, $m$ and $n$ scale proportionally. Notably, these algorithms achieve optimal sample complexity up to logarithmic factors.